A method for recommending items to users using automated collaborative filtering
stores profiles of users relating ratings to items in memory. Profiles of items are also stored
in memory, the item profiles associating users with the rating given to the item by that
user. Similarity factors with respect to other users are calculated for a user.

METHOD AND APPARATUS FOR ITEM RECOMMENDATION USING
AUTOMATED COLLABORATIVE FILTERING



Cross-Reference to Related Applications

This application claims the benefit of prior filed co-pending provisional applications Serial
No. 60/000,598, filed June 30, 1995, and Serial No. 60/008,458, filed December 11, 1995, which
are both incorporated herein by reference.

Field of the Invention 

The present invention relates to a method and apparatus for recommending items and, in
particular, to a method and apparatus for recommending items using automated collaborative 
filtering and feature-guided automated collaborative filtering. 

Background of the Invention

The amount of information, as well as the number of goods and services, available to 
individuals is increasing exponentially. This increase in items and information is occurring across 
all domains, e.g. sound recordings, restaurants, movies, World Wide Web pages, clothing stores,
etc. An individual attempting to find useful information, or forced to decide between competing
goods and services, is often faced with a bewildering selection of sources and choices.

Individual sampling of all items, even in a particular domain, may be impossible. For 
example, sampling every restaurant of a particular type in New York City would tax even the
most avid diner. Such a sampling would most likely be prohibitively expensive to carry out, and
the diner would have to suffer through many unenjoyable restaurants.

In many domains, individuals have simply learned to manage information overload by
relying on a form of generic referral system. For example, in the domain of movie and sound
recordings, many individuals rely on reviews written by paid reviewers. These reviews, however,
are simply the viewpoint of one, or perhaps two, individuals and may not have a likelihood of
correlating with how the individual will actually perceive the movie or sound recording. Many
individuals may rely on a review only to be disappointed when they actually sample the item.

One method of attempting to provide an efficient filtering mechanism is to use content-based
filtering. The content-based filter selects items from a domain for the user to sample based
upon correlations between the content of the item and the user's preferences. Content-based
filtering schemes suffer from the drawback that the items to be selected must be of some
machine-readable form, or attributes describing the content of the item must be entered by hand.
This makes content-based filtering problematic for existing items such as sound recordings,
photographs, art, video, and any other physical item that is not inherently machine-readable.
While item attributes can be assigned by hand in order to allow a content-based search, for many
domains of items such assignment is not practical. For example, it could take decades to enter
even the most rudimentary attributes for all available network television video clips by hand.

Perhaps more importantly, even the best content-based filtering schemes cannot provide
an analysis of the quality of a particular item as it would be perceived by a particular user, since
quality is inherently subjective. So, while a content-based filtering scheme may select a number of
items based on the content of those items, a content-based filtering scheme generally cannot
further refine the list of selected items to recommend items that the individual will enjoy.

Summary of the Invention

The present invention, in one aspect, relates to a method for recommending an item to a 
user which the user has not yet rated. A profile for each user is stored in a memory, and each user 
profile includes ratings given to items by the user. A profile for each item is stored in a memory, 
and each item profile includes ratings given to the item by the users. 

A set of similarity factors is calculated for each user; each similarity factor represents the
degree of agreement between two users' opinions for all items. Using these similarity factors, a
set of neighboring users is selected for each user. The neighboring users are assigned a weight.
Using the weights and the ratings given to items by the neighboring users, a recommendation for
an item not yet rated by a user is made.

In some embodiments, the calculating step comprises receiving a rating from one of the
users for one of the items. Both the rating user's profile and the rated item's profile are updated
with the received rating, and a new set of similarity factors is calculated for the rating user. In
other embodiments, ratings for items are received via a local area network or a wide area
network.

In one embodiment, once a rating for an item is received from a user, the rated item's
profile is retrieved to determine other users that have also rated the item. A new plurality of
similarity factors is calculated for the rating user only with respect to the users that have also
rated the item.

In certain embodiments, the similarity factors for a particular user are calculated by
retrieving an item profile, determining other users that have also rated the item, and calculating
a similarity factor between the particular user and each of the other users that have also rated the
item. These steps are repeated until all items rated by the particular user have been retrieved and
all similarity factors for the user have been calculated. These steps, in turn, are repeated until
similarity factors for all users have been calculated.

In one embodiment, the similarity factors are calculated by subtracting the rating given to
the item by each of the other users from the rating given to the item by the requesting user,
squaring each rating difference, and dividing the sum of the squared differences by the number of
other users that have also rated the item.

In another embodiment, the similarity factors may be a Pearson r correlation coefficient
between users. In this embodiment, the similarity factor between user x and user y is given by a
quotient. The numerator is the sum, over items, of the product of two deviations: the rating given
to the item by user x less the mean score given to items by user x, and the rating given to the
item by user y less the mean score given to items by user y. The denominator is the square root
of the product of two sums. The first sum is the sum of the squared differences between the
rating given by user x to each item and the mean score given to all items by user x. The second
sum is the sum of the squared differences between the rating given by user y to each item and
the mean score given by user y to all items.
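
Written out in symbols, the Pearson r coefficient described above takes the form below. The notation is assumed here for illustration only: H_ix is the rating given to item i by user x, and \bar{H}_x is the mean score given to items by user x. The sums run over the items being compared; the text above does not say whether that set is limited to items rated by both users, although that is the conventional choice.

```latex
r_{xy} =
  \frac{\sum_{i} \left(H_{ix} - \bar{H}_x\right)\left(H_{iy} - \bar{H}_y\right)}
       {\sqrt{\sum_{i} \left(H_{ix} - \bar{H}_x\right)^2 \;
              \sum_{i} \left(H_{iy} - \bar{H}_y\right)^2}}
```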

In other embodiments, users are selected as neighboring users when the similarity factor
between two users is less than a predetermined threshold value. In still other embodiments, a
weight is assigned to neighboring users by subtracting, for each neighboring user, the similarity
factor for that neighboring user from the predetermined threshold value and dividing each
difference by the predetermined threshold value.



WO 97/02537    PCT/US96/10492



In some other embodiments, an item is recommended by predicting a rating for each item 
not yet rated by a user. The predicted rating is arrived at by taking a weighted average of the 
ratings given to the items by the user's neighboring users. Items are then recommended to the 
user based on the predicted ratings. 

In one embodiment, the user selects an item and a rating is predicted by taking a weighted
average of the ratings given to the selected item by the user's neighboring users. 

In certain embodiments, information about the recommended items can be provided on a
display.

In another aspect, the present invention relates to a method for recommending an item to a 
user which has not yet been rated by the user, each item belonging to at least one group of items. 
A set of similarity factors for each user is calculated, representing the degree of agreement in item 
ratings between users within different groups. Neighboring users are selected within each group, 
a weight is assigned to each of the neighboring users for each group, and items are recommended 
based on the weights assigned to the user's neighboring users and the ratings given to the unrated 
item by the user's neighboring users. 

In some embodiments, the similarity factors are calculated by retrieving the item profile for 
an item that has been rated by a user, determining other users that have also rated the item, 
calculating a similarity factor between the user and the other users that have also rated the item, 
and repeating these steps until all items rated by the user in a group have been retrieved. These
steps, in turn, are repeated until similarity factors for each user have been calculated. 

In one embodiment, the similarity factors are calculated only using ratings for other items 
belonging to the same group. 

In other embodiments, each selected neighboring user has a similarity factor less than a 
predetermined threshold value and neighboring users are assigned weights by subtracting, for each 
neighboring user, the similarity factor for that neighboring user from a predetermined threshold
value and dividing each difference by the predetermined threshold value. 

In still other embodiments, a rating is predicted for an item in a group not yet rated by a 
user by taking a weighted average of the ratings given to the items in the group by the user's 
neighboring users, and recommending a predetermined number of items from the group based on
the predicted ratings for those items. 

An item may be selected by the user, in which case a rating is predicted for the selected 
item by taking a weighted average of the ratings given to the selected item by the user's 
neighboring users within the group. The user may also select a particular group for which to 
receive recommendations, in which case, a rating is predicted for items in the selected group by 
taking a weighted average of the ratings given to the items in the group by the user's neighboring 
users for that group. A predetermined number of items are recommended based on their 
predicted rating. 

In certain embodiments, items belong to more than one group. In certain other
embodiments, information about recommended items is provided on a display.

In another aspect, the present invention relates to a method for recommending other users 
to a user. User profiles are stored in a memory for each user, and each user profile includes 
ratings given to items by the user. Item profiles are also stored in a memory for each of the items, 
and each of the items belongs to one of a plurality of groups. Each item profile includes a 
plurality of ratings given to the item by one of the users. 

Similarity factors are calculated for each user, each similarity factor representing the rating 
similarity between each user and another one of the users for a particular group. Neighboring 
users are recommended to a user based on the similarity factors. 

In a certain embodiment, a neighboring user is recommended based on the similarity 
factors and the number of items rated by both users. 

In another aspect, the present invention relates to a method for recommending an item, the 
item having at least one feature defining a characteristic of the item and having more than one 
possible value. The possible values may be grouped into clusters of feature values. User profiles 
are stored in a memory for each user, and the user profiles include ratings given to items by each 
user. Item profiles are also stored in a memory for each of the items, and the item profiles include 
ratings given to each item by the users. 

A weight is assigned to each cluster of feature values for each user. A weight is also
assigned to each feature for each user. Using the feature weights, the feature value cluster
weights, and the ratings given to items by the users, a set of similarity factors is calculated for
each user. For each user, a set of neighboring users is selected responsive to the similarity
factors; a weight is assigned to each of the neighboring users, and items are recommended to a
user based on the weights assigned to the user's neighboring users and the ratings given to the
unrated item by the user's neighboring users.

In some embodiments, a weight is assigned, for each user, to each value cluster based on the
rating given to the item by the user and the number of feature value clusters present. 

In other embodiments, a weight is assigned, for each user, to each feature. This can be 
done based on the number of features defined for the item, or a feature weight may be assigned
based on the weights assigned to each feature value cluster. In particular, the feature weights can
be assigned by dividing the standard deviation of the cluster weights by the mean of the cluster 
weights. 
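
The feature-weight rule just described (standard deviation of the cluster weights divided by their mean) can be sketched as follows. This is a minimal illustration and not the patented implementation: the function name `feature_weight`, the use of the population standard deviation, and the handling of a zero mean are all assumptions that the text does not specify.

```python
from statistics import mean, pstdev


def feature_weight(cluster_weights):
    """Return (standard deviation / mean) of one user's cluster weights for a feature."""
    m = mean(cluster_weights)
    if m == 0:
        # Assumption: a feature whose cluster weights average to zero gets no weight.
        return 0.0
    # Assumption: population standard deviation; the text does not say which form is used.
    return pstdev(cluster_weights) / m


# A feature whose clusters are weighted very differently scores higher than
# a feature whose clusters are all weighted alike.
uneven = feature_weight([2, 4, 4, 4, 5, 5, 7, 9])
even = feature_weight([3, 3, 3])
```

Under this rule a feature matters more to a user when that user's cluster weights for the feature vary widely relative to their average.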

In still other embodiments, a similarity factor is calculated for each feature value with 
respect to all other feature values, and the feature values are grouped based on the feature value 
similarity factors. 



In another aspect, the present invention relates to a method for recommending an item to a
user, the item having at least one feature defining a characteristic of the item and having a
plurality of possible values that may be grouped into clusters of feature values. User profiles are
stored in a memory for each user, and the user profile includes ratings given to items by the user.
Item profiles are stored in a memory for each item, and the item profile includes ratings given to
the item by the users.

A weight is assigned to each feature value cluster for each user. A weight is assigned to
each feature for each user. Using the feature weights and the feature value cluster weights,
similarity factors between items are calculated for a particular user. An item for which a favorable
rating has been received from the user is selected, and a number of items are recommended to the
user responsive to the item similarity factors.

In another aspect, the present invention relates to an article of manufacture having
program means for recommending an item embodied thereon. The article includes
computer-readable program means for storing a user profile in a memory; the user profile includes
ratings given to the items by the user. The article also includes computer-readable program means
for storing an item profile in a memory; the item profile includes ratings given to the item by the
users.

Also included are computer-readable program means for calculating similarity factors for 
each user, each of the similarity factors representing the similarity between each user and another
one of the users, and computer-readable program means for selecting neighboring users for each 
user responsive to the similarity factors. 

The article also includes computer-readable program means for assigning a weight to each 
of the neighboring users, as well as computer-readable program means for recommending an item
to one of the users based on the weights assigned to the user's neighboring users and the ratings
given to the unrated item by the user's neighboring users. 

In yet another aspect, the present invention relates to an article of manufacture including
computer-readable program means for predicting a rating for an item, the item having at least one
feature defining a characteristic of the item and having a plurality of possible values that may be
grouped into clusters of feature values.

Also included on the article are: computer-readable program means for storing a user
profile in a memory for each user, each user profile including ratings given to the items by the
user; computer-readable program means for storing an item profile in a memory for each item,
each item profile including ratings given to the item by the users; computer-readable program
means for assigning, for each user, a weight to each value cluster within each feature;
computer-readable program means for assigning, for each user, a weight to each feature;
computer-readable program means for calculating a set of similarity factors for each user, each
similarity factor based on the feature weights, the cluster weights, and the ratings given to items
by the respective users; computer-readable program means for selecting, for each user, a set of
neighboring users responsive to the similarity factors; computer-readable program means for
assigning a weight to each of the neighboring users; and computer-readable program means for
recommending an item to one of the users based on the weights assigned to the user's neighboring
users and the ratings given to the unrated item by the user's neighboring users.

In another aspect, the present invention relates to an apparatus for recommending an item
to a user which has not yet rated the item. The apparatus has a memory element for storing user
profiles; each user profile includes ratings given to items by the user. A memory element for
storing item profiles is also included; each item profile includes ratings given to the item by the
users.



The apparatus also includes means for calculating similarity factors for each user, each of 
the similarity factors representing the similarity between each user and another one of the users,
and means for selecting, for each of the users, a set of neighboring users responsive to the
similarity factors.

The apparatus also has means for assigning a weight to each of the neighboring users; and
means for recommending at least one of the items to one of the users based on the weights 
assigned to the user's neighboring users and the ratings given to the unrated item by the user's 
neighboring users. 

In yet another aspect, the present invention relates to an apparatus for recommending an 
item to a user, the item having at least one feature defining a characteristic of the item and having 
many possible feature values that may be grouped. The apparatus includes a memory element for
storing user profiles; each user profile includes ratings given to the items by the user. A memory
element for storing item profiles is also included; each item profile includes ratings given to the
item by the users. Means for assigning a weight to each value cluster within each feature is also
provided. 

The apparatus also includes: means for assigning, for each user, a weight to each feature; 
means for calculating similarity factors for each user, each of the similarity factors based on the
feature weights, the cluster weights, and the ratings given to items by the respective users; means
for selecting, for each of the users, a plurality of neighboring users responsive to the similarity
factors; means for assigning a weight to each of the neighboring users; and means for
recommending an item to one of the users based on the weights assigned to the user's neighboring
users and the ratings given to the unrated item by the user's neighboring users.

Brief Description of the Drawings

This invention is pointed out with particularity in the appended claims. The above and 
further advantages of this invention may be better understood by referring to the following 
description taken in conjunction with the accompanying drawings, in which: 

FIG. 1 is a flowchart of one embodiment of the method of the present invention;

FIG. 2 is a flowchart of one embodiment of the method of the present invention;

FIG. 3 is a block diagram of an embodiment of the apparatus of the present invention; and

FIG. 4 is a block diagram of the Internet system on which the preferred methods and 
apparatus may be used. 

Detailed Description of the Invention

As referred to in this description, items to be recommended can be items of any type that a
user may sample, such as sound recordings, movies, restaurants, vacation destinations, novels, or
World Wide Web pages. When reference is made to a "domain," it is intended to refer to any
category or subcategory of ratable items, such as sound recordings, movies, or restaurants in a
particular city. Referring now to FIG. 1, a method for recommending items begins by storing user
and item information in profiles.

A plurality of user profiles is stored in a memory element (step 102). It is preferred that
one profile is created for each user; however, multiple profiles may be created for a user to
represent that user over multiple domains. The memory element can be any memory element
known in the art that is capable of storing sufficient data for the plurality of profiles and allowing
the profiles to be updated, such as a disc drive or random access memory.

Each user profile associates ratings given to an item by the user with a particular item. 
User profiles can be any data construct that facilitates this association, such as an array, although
it is preferred to provide user profiles as sparse vectors of ordered pairs. Each ordered pair
contains a number representing the rated item and a number representing the rating that the user
gave to the item. 

In the preferred embodiment, a profile for a user is created when that user first begins
rating items, although in multi-domain applications user profiles may be created for particular
domains only when the user begins to explore, and rate items within, those domains. Whenever a
user profile is created, a number of ratings for items are solicited from the user. This can be done
by providing the user with a particular set of items to rate corresponding to a particular group of
items. Groups are genres of items and are discussed below in more detail. Other methods of
soliciting ratings from the user include: manual entry of item-rating pairs, in which the user simply
submits a list of items and ratings assigned to those items; soliciting ratings by date of entry into
the system, i.e., asking the user to rate the newest items added to the system; soliciting ratings of
the most rated items; or allowing a user to rate items similar to an initial item selected by the
user. The preferred embodiment uses all of the methods described above and allows the user to
select the particular method they wish to employ.

Profiles are stored for each item that has been rated by at least one user (step 104). Each 
item profile records how particular users have rated this particular item. Any data construct that 
associates ratings with certain users can be used. It is preferred to provide item profiles as a
sparse vector of ordered pairs. Each ordered pair contains a number representing a particular
user and a number representing the rating that user gave to the item. Item profiles are created
when the first rating is given to an item. Although FIG. 1 shows item profiles being stored after
user profiles, the two may be stored in any order, and can occur simultaneously.
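
The two kinds of profile described above can be sketched as sparse collections of ordered pairs. This is an illustrative sketch only: the function name `build_profiles`, the use of Python dictionaries of lists, and the sample identifiers are assumptions, and any data construct that keeps the same associations would serve.

```python
from collections import defaultdict


def build_profiles(ratings):
    """Build user and item profiles from (user_id, item_id, rating) triples."""
    user_profiles = defaultdict(list)  # user_id -> [(item_id, rating), ...]
    item_profiles = defaultdict(list)  # item_id -> [(user_id, rating), ...]
    for user_id, item_id, rating in ratings:
        # Each rating is stored twice: once under the user, once under the item.
        user_profiles[user_id].append((item_id, rating))
        item_profiles[item_id].append((user_id, rating))
    return user_profiles, item_profiles


# Hypothetical sample data: user 1 rates items 10 and 11, user 2 rates item 10.
users, items = build_profiles([(1, 10, 7), (1, 11, 3), (2, 10, 6)])
```

Only items a user has actually rated appear in that user's profile, which is what makes the vectors sparse; an item profile exists only once the item has received its first rating.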

A similarity factor for each user is calculated with respect to all other users (step 106), and
represents the degree of similarity between any two users with respect to all items. It is currently
preferred that the more similar two users are, the closer the similarity factor is to zero.
Specialized hardware may be provided for calculating the similarity factors between users,
although it is preferred to provide a general-purpose computer with appropriate programming to
calculate the similarity factors.

The similarity factor may be calculated by comparing a user's profile with the profile of
every other user. This is computationally intensive, however, and it is preferred to calculate the
similarity factor by first examining the profiles of each item for which a user has entered a rating
and determining which other users have also rated each item. The similarity factors between the
user and the other users that have also rated the items are the only similarity factors updated.
These steps are repeated for all users.

Ratings for items are received from users. Item ratings can be of any form that allows
users to record subjective impressions of items based on their experience of the item. For
example, items may be rated on an alphabetic scale ("A" to "F") or a numerical scale (1 to 10). It
is currently preferred that ratings are integers between 1 (lowest) and 7 (highest). Ratings can be
received as input to a stand-alone machine, as input to a system via electronic mail, or as input to
a system via a local area or wide area network. In the currently preferred embodiment, ratings are
received as input to a World Wide Web page. Ratings can be received from users singularly or in
batches, and may be received from any number of users simultaneously.



Whenever a rating is received from a user, the profile of the user who entered the rating 
must be updated as well as the profile of the item rated by the user. Profile updates may be stored
in a temporary memory location and entered at a convenient time. It is preferred, however, to
update profiles whenever a new rating is entered by a user. Ratings are updated by appending a 
new pair of values to the set of already existing ordered pairs in the profile. 

Whenever a user's profile is updated because a rating has been received from the user, 
new similarity factors between the user and all other users of this system must be calculated. This 
can be done by comparing the ratings given to items by the user to the ratings given to items by all 
other users. However, it is currently preferred to reduce the amount of computation necessary in 
the following manner. Whenever a user enters a rating for an item, that item's profile is retrieved
and the identity of other users that have also rated the item is determined. In this way, only the
similarity factors between the rating user and other users that have also rated the item are 
updated. 
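
The reduced-computation update described above can be sketched as follows. This is a sketch under stated assumptions, not the patented implementation: profiles are held as dictionaries rather than lists of ordered pairs, the names `receive_rating` and `mean_squared_difference` are hypothetical, and the average squared rating difference stands in for whichever similarity method is actually chosen.

```python
def mean_squared_difference(ratings_x, ratings_y):
    """Average squared rating difference over items rated by both users."""
    common = ratings_x.keys() & ratings_y.keys()
    return sum((ratings_x[i] - ratings_y[i]) ** 2 for i in common) / len(common)


def receive_rating(user, item, rating, user_profiles, item_profiles, similarity):
    """Record one rating, then refresh only the affected similarity factors."""
    # Update both the rating user's profile and the rated item's profile.
    user_profiles.setdefault(user, {})[item] = rating
    item_profiles.setdefault(item, {})[user] = rating

    # The item's profile names exactly the users who have also rated this item,
    # so only those pairs need a new similarity factor.
    for other in item_profiles[item]:
        if other == user:
            continue
        d = mean_squared_difference(user_profiles[user], user_profiles[other])
        similarity[(user, other)] = d
        similarity[(other, user)] = d


# Hypothetical users and items, rated on the 1 to 7 scale.
user_profiles, item_profiles, similarity = {}, {}, {}
receive_rating("ann", "film1", 7, user_profiles, item_profiles, similarity)
receive_rating("bob", "film1", 5, user_profiles, item_profiles, similarity)
receive_rating("cat", "film2", 1, user_profiles, item_profiles, similarity)
```

After these three ratings only the pair ("ann", "bob") has a similarity factor, because "cat" shares no rated item with either of them and so is never visited.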

Any number of methods can be used to calculate the similarity factors. In general, a
method for calculating similarity factors between users should minimize the average error between 
a predicted rating for an item and the rating a user would actually have given the item. 

It is also desirable to reduce error in cases involving "extreme" ratings. That is, a method 
which predicts fairly well for item ratings representing ambivalence towards an item but which
does poorly for item ratings representing extreme enjoyment or extreme disappointment with an
item is not useful for recommending items to users.

A similarity factor between users refers to any quantity which expresses the degree of
correlation between two users' profiles. The following methods for calculating the similarity
factor are intended to be exemplary, and in no way exhaustive. Depending on the item domain,
different methods will produce optimal results, since users in different domains will tolerate
different expected errors.

In the following description of methods, $D_{xy}$ represents the similarity factor calculated
between two users, x and y. $H_{ix}$ represents the rating given to item i by user x, I represents all
items in the database, and $c_{ix}$ is a Boolean quantity which is 1 if user x has rated item i and 0 if
user x has not rated that item.

One method of calculating the similarity between a pair of users is to calculate the average
squared difference between their ratings for mutually rated items. Thus, the similarity factor
between user x and user y is calculated by subtracting, for each item rated by both users, the
rating given to an item by user y from the rating given to that same item by user x and squaring
the difference. The squared differences are summed and divided by the total number of items
rated by both users. This method is represented mathematically by the following expression:

$$D_{xy} = \frac{\sum_{i \in I} c_{ix}\, c_{iy}\, (H_{ix} - H_{iy})^2}{\sum_{i \in I} c_{ix}\, c_{iy}}$$
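
As a worked illustration of the average squared difference, suppose (hypothetically) that user x has rated items 1, 2 and 3 with ratings 7, 3 and 5, and that user y has rated items 1, 2 and 4 with ratings 5, 4 and 2. Only items 1 and 2 are rated by both users, so only those two items contribute:

```latex
D_{xy} = \frac{(7 - 5)^2 + (3 - 4)^2}{2} = \frac{4 + 1}{2} = 2.5
```

Two users who agreed exactly on every mutually rated item would score 0, consistent with the preference that more similar users have a similarity factor closer to zero.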



A similar method of calculating the similarity factor between a pair of users is to divide the
sum of their squared rating differences by the number of items rated by both users raised to a
power greater than 1. This method is represented by the following mathematical expression:

$$D_{xy} = \frac{\sum_{i \in I} c_{ix}\, c_{iy}\, (H_{ix} - H_{iy})^2}{|C_{xy}|^{k}}$$

where $|C_{xy}|$ represents the number of items rated by both users and k is greater than 1.

A third method for calculating the similarity factor between users attempts to factor into
the calculation the amount of overlap between two user profiles. Thus, for each item rated by
both users, the rating given to an item by user y is subtracted from the rating given to that same
item by user x. These differences are squared and then summed. The amount of profile overlap is
taken into account by dividing the sum of squared rating differences by a quantity equal to the
number of items mutually rated by the users subtracted from the sum of the number of items rated
by user x and the number of items rated by user y. This method is expressed mathematically by:

$$D_{xy} = \frac{\sum_{i \in I} c_{ix}\, c_{iy}\, (H_{ix} - H_{iy})^2}{\sum_{i \in I} c_{ix} + \sum_{i \in I} c_{iy} - |C_{xy}|}$$

where $|C_{xy}|$ represents the number of items mutually rated by users x and y.
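
The three squared-difference methods described above differ only in their divisor, which the following sketch makes explicit. It is illustrative only: the function name, the dictionary representation of a profile, the default exponent `k=2.0`, and returning `None` when the users share no rated item are all assumptions not found in the text.

```python
def squared_difference_factors(ratings_x, ratings_y, k=2.0):
    """Return the three squared-difference similarity factors for users x and y.

    ratings_x and ratings_y map an item identifier to the rating the user gave.
    """
    common = ratings_x.keys() & ratings_y.keys()  # items rated by both users
    if not common:
        return None  # assumption: no factor is defined without any overlap
    total = sum((ratings_x[i] - ratings_y[i]) ** 2 for i in common)
    n_common = len(common)
    return {
        # First method: divide by the number of mutually rated items.
        "mean": total / n_common,
        # Second method: divide by that number raised to a power k > 1.
        "power": total / n_common ** k,
        # Third method: divide by items rated by x plus items rated by y,
        # less the mutually rated items.
        "overlap": total / (len(ratings_x) + len(ratings_y) - n_common),
    }


# Hypothetical profiles: items 1 and 2 are rated by both users.
x = {1: 7, 2: 3, 3: 5}
y = {1: 5, 2: 4, 4: 2, 5: 6}
factors = squared_difference_factors(x, y)
```

In this example the summed squared difference is 5, so the three divisors (2, 4 and 5) give noticeably different factors for the same pair of users, which is why the threshold used to select neighbors depends on the method chosen.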

In another embodiment, the similarity factor between two users is a Pearson r correlation
coefficient. Alternatively, the similarity factor may be calculated by constraining the correlation
coefficient with a predetermined average rating value, A. Using the constrained method, the
correlation coefficient, which represents $D_{xy}$, is arrived at in the following manner. For each item
rated by both users, A is subtracted from the rating given to the item by user x and from the rating
given to that same item by user y. Those differences are then multiplied. The summed product of
rating differences is divided by the square root of the product of two sums. The first sum is the
sum of the squared differences of the predefined average rating value, A, and the rating given to
each item by user x. The second sum is the sum of the squared differences of the predefined
average value, A, and the rating given to each item by user y. This method is expressed
mathematically by:

$$D_{xy} = \frac{\sum_{i \in C_{xy}} (H_{ix} - A)(H_{iy} - A)}{\sqrt{\sum_{i \in U_x} (H_{ix} - A)^2 \; \sum_{i \in U_y} (H_{iy} - A)^2}}$$

where $U_x$ represents all items rated by x, $U_y$ represents all items rated by y, and $C_{xy}$ represents
all items rated by both x and y.
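
The constrained correlation can be sketched as below. Several points are assumptions rather than statements from the text: the function name, the default value of 4 for the predetermined average rating A (the midpoint of the 1 to 7 scale), the return value of 0.0 when a denominator sum is zero, and the square root in the denominator, which follows the Pearson r coefficient as described in the Summary.

```python
from math import sqrt


def constrained_pearson(ratings_x, ratings_y, average=4.0):
    """Correlation of two users' ratings measured about a fixed value `average`."""
    common = ratings_x.keys() & ratings_y.keys()  # items rated by both users
    # Numerator: summed product of deviations over mutually rated items only.
    numerator = sum(
        (ratings_x[i] - average) * (ratings_y[i] - average) for i in common
    )
    # Denominator sums run over all items each user has rated.
    sum_x = sum((r - average) ** 2 for r in ratings_x.values())
    sum_y = sum((r - average) ** 2 for r in ratings_y.values())
    denominator = sqrt(sum_x * sum_y)
    if denominator == 0:
        return 0.0  # assumption: a user who rated everything at `average`
    return numerator / denominator


# Hypothetical users: identical tastes, then exactly opposite tastes.
agree = constrained_pearson({1: 7, 2: 1}, {1: 7, 2: 1})
disagree = constrained_pearson({1: 7, 2: 1}, {1: 1, 2: 7})
```

Note that a correlation computed this way grows with agreement, so a system using it would need to map it onto the smaller-is-more-similar convention before applying a threshold test of the kind described for selecting neighbors; how that is done is not specified here.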

Regardless of the method used to generate them, the similarity factors are used to select a 
plurality of users that have a high degree of correlation to a user (step 108). These users are 
called the user's "neighboring users." The neighboring users are selected from all other users 
based on having a similarity factor with respect to the requesting user less than a predetermined 
threshold value, L. The threshold value, L, can be set to any value which improves the predictive
capability of the method. In general, the value of L will change depending on the method used to
calculate the similarity factors, the item domain, and the number of ratings that have
been entered.

A weight is assigned to each of the neighboring users (step 110). In the preferred
embodiment, the weights are assigned by subtracting the similarity factor calculated for each
neighboring user from the threshold value and dividing by the threshold value. This provides a
user weight that is higher, i.e., closer to one, when the similarity factor between two users is
smaller. Thus, similar users are weighted more heavily than other, less similar, users.
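
The selection and weighting steps can be illustrated with a minimal Python sketch. It assumes the similarity factors for one user are held in a dictionary keyed by the other users, and that a smaller factor means greater similarity, as described above. The function name `select_neighbors` and the sample values are hypothetical.

```python
def select_neighbors(similarity, threshold):
    """Return {neighbor: weight} for users whose similarity factor is below threshold.

    similarity: dict mapping other-user id -> similarity factor (smaller = more similar).
    The weight is (threshold - factor) / threshold, so it approaches 1 as the factor
    approaches 0.
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    return {
        user: (threshold - factor) / threshold
        for user, factor in similarity.items()
        if factor < threshold
    }


# Hypothetical similarity factors of three users with respect to the requesting user.
factors = {"ann": 0.5, "bob": 1.5, "cat": 3.0}
weights = select_neighbors(factors, threshold=2.0)
# "cat" is not a neighbor because 3.0 is not below the threshold of 2.0.
```

With a threshold of 2.0, the closer user receives the larger weight, which matches the intent that similar users count more heavily.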

Once weights are assigned to the neighboring users, an item is recommended to a user
(step 112). For applications in which positive item recommendations are desired, items are
recommended if the user's neighboring users have also rated the item highly. For an application
desiring to warn users away from items, items are displayed as recommended against when the
user's neighboring users have also given poor ratings to the item. Once again, although
specialized hardware may be provided to select neighboring users and weight them, it is currently
preferred to provide an appropriately programmed general-purpose computer to provide these
functions.

The item to be recommended may be selected in any fashion, so long as the ratings of the
neighboring users and their assigned weights are taken into account. In one embodiment, a rating
is predicted for each item that has not yet been rated by the user. This predicted rating is arrived
at by taking a weighted average of the ratings given to those items by the user's neighboring
users. A predetermined number of items are then recommended to the user based on the
predicted ratings.

The predetermined number of items can be selected such that those items having the
highest predicted rating are recommended to the user. Alternatively, the predetermined number
of items may be selected based on having the lowest predicted rating of all the items.
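
A weighted-average prediction of this kind might look like the following sketch. The data layout (nested dictionaries of ratings, a dictionary of neighbor weights) and the names `predict_rating` and `recommend` are assumptions made for illustration, not part of the described system.

```python
def predict_rating(item, neighbor_weights, ratings):
    """Weighted average of the neighbors' ratings for item; None if no neighbor rated it."""
    weighted_sum = 0.0
    weight_total = 0.0
    for user, weight in neighbor_weights.items():
        rating = ratings.get(user, {}).get(item)
        if rating is not None:
            weighted_sum += weight * rating
            weight_total += weight
    return weighted_sum / weight_total if weight_total > 0 else None


def recommend(user, neighbor_weights, ratings, n=3, highest=True):
    """Return up to n (item, predicted rating) pairs for items the user has not rated.

    highest=True selects the highest predicted ratings; False selects the lowest.
    """
    already_rated = set(ratings.get(user, {}))
    candidates = {i for u in neighbor_weights for i in ratings.get(u, {})} - already_rated
    predicted = {i: predict_rating(i, neighbor_weights, ratings) for i in candidates}
    predicted = {i: p for i, p in predicted.items() if p is not None}
    return sorted(predicted.items(), key=lambda kv: kv[1], reverse=highest)[:n]


# Hypothetical ratings on a numeric scale and previously assigned neighbor weights.
ratings = {
    "me": {"a": 5},
    "ann": {"a": 5, "b": 6, "c": 2},
    "bob": {"b": 2, "c": 6},
}
neighbor_weights = {"ann": 0.75, "bob": 0.25}
top = recommend("me", neighbor_weights, ratings, n=1)
```

Passing `highest=False` gives the alternative selection of the lowest predicted ratings, which suits an application that warns users away from items.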

In another embodiment the user selects an item for which a predicted rating is desired. A 
rating is predicted by taking a weighted average of the ratings given to that item by the user's 
neighboring users.

Whatever method is used, information about the recommended items can be displayed to
the user. For example, in a music domain, the system may display a list of recommended albums
including the name of the recording artist, the name of the album, the record label which made the
album, the producer of the album, "hit" songs on the album, and other information. In the
embodiment in which the user selects an item and a rating is predicted for that item, the system
may display the actual rating predicted, or a label representing the predicted rating. For example,
instead of displaying 6.8 for the predicted rating, a system may instead display "highly
recommended".

In one embodiment, items are grouped in order to help predict ratings and increase
recommendation certainty. For example, in the broad domain of music, recordings may be
grouped according to various genres, such as "opera," "pop," "rock," and others. Groups are
used to improve performance because predictions and recommendations for a particular item are
made based only on the ratings given to other items within the same group. Groups may be
determined based on information entered by the users; however, it is currently preferred to
generate the groups using the item data itself.

Generating the groups using the item data itself can be done in any manner which groups
items together based on some differentiating feature. For example, in the item domain of music
recordings, groups could be generated corresponding to "pop," "opera," and others.

In the preferred embodiment, item groups are generated by, first, randomly assigning all
items in the database to a number of groups. The number of desired groups can be predetermined
or random. For each initial group, calculate the centroid of the scores for items assigned to that
group. This can be done by any method that determines the approximate mean value of the
spectrum of scores contained in the item profiles assigned to the initial group, such as
eigenanalysis. It is currently preferred to average all values present in the initial group.

After calculating the group centroids, determine to which group centroid each item is
closest, and move it to that group. Whenever an item is moved in this manner, recalculate the
centroids for the affected groups. Iterate until the distances between all group centroids and the
items assigned to each group are below a predetermined threshold or until a certain number of
iterations have been accomplished.

A method using grouping to improve performance calculates similarity factors for a user
with respect to other users in a particular group (step 106). For example, a user may have one
similarity factor with respect to a second user for the "pop" grouping of music items and a second
similarity factor with respect to that same user for the "opera" grouping of music items. This is
because the "pop" similarity factor is calculated using only ratings for "pop" items, while the
"opera" similarity factor is calculated only for "opera" items. Any of the methods described
above for calculating similarity factors may be used.
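
The item-grouping procedure of the preferred embodiment (random initial assignment, centroid calculation by averaging, and reassignment of each item to its closest centroid) can be sketched as follows. This is an illustration under stated assumptions: each item profile is taken to be a dense, equal-length list of scores, squared Euclidean distance is used, and iteration stops when no item moves or after `max_iter` passes rather than on a distance threshold. The name `group_items` is hypothetical.

```python
import random


def centroid(vectors):
    """Component-wise mean of a non-empty list of equal-length score vectors."""
    count = len(vectors)
    return [sum(column) / count for column in zip(*vectors)]


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def group_items(profiles, k, max_iter=100, seed=0):
    """profiles: dict item -> list of scores. Returns dict item -> group index in range(k)."""
    rng = random.Random(seed)
    # Step 1: randomly assign every item to one of k groups.
    assign = {item: rng.randrange(k) for item in profiles}

    def recompute(group):
        members = [profiles[i] for i, g in assign.items() if g == group]
        return centroid(members) if members else None  # None marks an empty group

    # Step 2: centroid of each initial group (simple average of member scores).
    cents = [recompute(g) for g in range(k)]

    for _ in range(max_iter):
        moved = False
        for item, vec in profiles.items():
            old = assign[item]
            # Step 3: find the closest non-empty group centroid for this item.
            best = min(
                (g for g in range(k) if cents[g] is not None),
                key=lambda g: sq_dist(vec, cents[g]),
            )
            if best != old:
                # Step 4: move the item and recalculate the two affected centroids.
                assign[item] = best
                cents[old], cents[best] = recompute(old), recompute(best)
                moved = True
        if not moved:  # assumption: stop once a full pass moves nothing
            break
    return assign


# Hypothetical item profiles, each holding the scores given by two users.
profiles = {"a": [1.0, 1.0], "b": [1.0, 2.0], "c": [9.0, 9.0], "d": [9.0, 8.0]}
groups = group_items(profiles, k=2)
```

Because the initial assignment is random, the resulting grouping depends on the seed; a production system would likely add the distance-threshold stopping rule described in the text.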

The neighboring users are selected based on the similarity factors (step 108). The
neighboring users are weighted, and recommendations for items are arrived at (steps 110 and
112) as above. A weighted average of the ratings given to other items in the group can be used to
recommend items both inside the group and outside the group. For example, if a user has a high
correlation with another user in the "pop" grouping of music items (the similarity factor between
the users is close to 0), that similarity factor can be used to recommend music items inside the
"pop" grouping, since both users have rated many items in the group. The similarity factor can
also be used to recommend a music item outside of the group, if one of the users has rated an
item in another group. Alternatively, a user may select a group, and a recommendation list will be
generated based on the predicted rating for the user's neighboring users in that group.



Whether or not grouping is used, a user or set of users may be recommended to a user as
similar in taste. In this case, the similarity factors calculated from the user profiles and item
profiles are used to match similar users and introduce them to each other. This is done by
recommending one user to another in much the same way that an item is recommended to a user.
It is possible to increase the recommendation certainty by including the number of items rated by
both users in addition to the similarity factors calculated for the users.

Grouping is a special case of "feature-guided automated collaborative filtering" when 
there is only one feature of interest. In the example above, the feature of interest was genre of 
music. The method of the present invention works equally well for item domains in which the 
items have multiple features of interest. 



The method using feature-guided automated collaborative filtering incorporates feature
values associated with items in the domain. The term "feature value" is used to describe any
information stored about a particular feature of the item. For example, a feature may have
Boolean feature values indicating whether or not a particular feature exists in a particular item.

Alternatively, features may have numerous values, such as terms appearing as "keywords"
in a document. In some embodiments, each feature value can be represented by a vector in some
metric space, where each term of the vector corresponds to the mean score given by a user to
items having the feature value.

Ideally, it is desirable to calculate a vector of distances between every pair of users, one
for each possible feature value defined for an item. This may not be possible if the number of
possible feature values is very large, i.e., keywords in a document, or the distribution of feature
values is extremely sparse. Thus, in many applications, it is desirable to cluster feature values.
The terms "cluster" and "feature value cluster" are used to indicate both individual feature values
as well as feature value clusters, even though feature values may not necessarily be clustered.

Feature value clusters are created by defining a distance function $\Delta$, defined for any two
points in the vector space, as well as a vector combination function $\Omega$, which combines any two
vectors in the space to produce a third point in the space that in some way represents the average
of the points. Although not limited to the examples presented, three possible formulations of
$\Delta$ and $\Omega$ are presented below.

The notion of similarity between any two feature values is how similarly they have been
rated by the same user, across the whole spectrum of users and items. One method of defining
the similarity between any two feature values is to take a simple average. Thus, we define the
value $v_i^{\alpha x}$ to be the mean of the ratings given to each item containing feature value
$FV_x^{\alpha}$ that user i has rated. Expressed mathematically:

$$v_i^{\alpha x} = \begin{cases} \dfrac{\sum_p H_{pi} \times c_{pi} \times \Gamma_p^{\alpha x}}{\sum_p c_{pi} \times \Gamma_p^{\alpha x}} & \text{if } \sum_p c_{pi} \times \Gamma_p^{\alpha x} > 0 \\ \text{Undefined} & \text{otherwise} \end{cases}$$

where $H_{pi}$ is the rating given to item p by user i, $c_{pi}$ indicates whether user i has rated
item p, and $\Gamma_p^{\alpha x}$ indicates the presence or absence of feature value $FV_x^{\alpha}$
in item p. Any distance metric may be used to determine the per-user dimension squared distance
between the vectors for feature value $\alpha_x$ and feature value $\alpha_y$ for user i. For
example, any of the methods referred to above for calculating user similarity may be used.

Defining $\delta$ as the per-user dimension squared distance between two feature values, the
total distance between the two feature value vectors is expressed mathematically as:



[equation illegible in source]

where the term [illegible in source] represents adjustment for missing data.



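The combination function for the two vectors, which represents a kind of average for the
two vectors, is expressed mathematically by the following three equations:

$$v_i^{\alpha xy} = \begin{cases} \dfrac{v_i^{\alpha x} + v_i^{\alpha y}}{2} & \text{if } \eta_i^{\alpha x} = 1 \text{ and } \eta_i^{\alpha y} = 1 \\ v_i^{\alpha x} & \text{if } \eta_i^{\alpha x} = 1 \text{ and } \eta_i^{\alpha y} = 0 \\ v_i^{\alpha y} & \text{if } \eta_i^{\alpha x} = 0 \text{ and } \eta_i^{\alpha y} = 1 \end{cases}$$

wherein $\eta_i^{\alpha x}$ indicates whether $v_i^{\alpha x}$ is defined.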
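
The simple-average formulation can be sketched in Python as follows. The sketch builds, for one feature value, the vector of per-user mean ratings, and then combines two such vectors by averaging where both entries are defined and otherwise taking whichever entry is defined. Undefined entries are represented by `None`; leaving an entry undefined when neither input is defined is an assumption, as are the function names and the sample data.

```python
def feature_value_vector(ratings, items_with_value, users):
    """Per-user mean rating over items carrying the feature value; None where undefined."""
    vector = []
    for user in users:
        rated = [r for item, r in ratings.get(user, {}).items() if item in items_with_value]
        vector.append(sum(rated) / len(rated) if rated else None)
    return vector


def combine(vx, vy):
    """Entry-wise combination of two feature value vectors."""
    combined = []
    for a, b in zip(vx, vy):
        if a is not None and b is not None:
            combined.append((a + b) / 2)  # both defined: simple average
        else:
            combined.append(a if a is not None else b)  # one defined, or None if neither
    return combined


# Hypothetical ratings and two feature values ("jazz" and "blues") of a genre feature.
users = ["u1", "u2", "u3"]
ratings = {"u1": {"p1": 6, "p2": 4}, "u2": {"p1": 2}, "u3": {"p3": 7}}
jazz = feature_value_vector(ratings, {"p1", "p2"}, users)
blues = feature_value_vector(ratings, {"p2", "p3"}, users)
merged = combine(jazz, blues)
```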

Another method for calculating the similarity between any two feature values is to assume
the number of values used to compute $v_i^{\alpha x}$ is sufficiently large. If this assumption is
made, the Central Limit Theorem can be used to justify approximating the distribution of vectors by
a Gaussian distribution.

Since the Gaussian distribution can be effectively characterized by its mean, variance and
sample size, each entry is now a triplet:



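$$v_i^{\alpha x} = \langle \mu_i^{\alpha x},\ \sigma_i^{2\,\alpha x},\ N_i^{\alpha x} \rangle$$

where

$$\mu_i^{\alpha x} = \frac{\sum_{p=1}^{|Items|} H_{pi} \times c_{pi} \times \Gamma_p^{\alpha x}}{\sum_{p=1}^{|Items|} c_{pi} \times \Gamma_p^{\alpha x}}$$

is the sample mean of the population,

[equation for $\sigma_i^{2\,\alpha x}$ illegible in source]

is the variance of the sampling distribution, and

$$N_i^{\alpha x} = \sum_{p=1}^{|Items|} (c_{pi} \times \Gamma_p^{\alpha x})$$

is the sample size.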

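The total distance between the two feature value vectors is expressed
mathematically by:

[equation illegible in source]

The feature value combination function combines the corresponding triplets from
the two vectors by treating them as Gaussians, and therefore is represented mathematically by:

$$v_i^{\alpha xy} = \begin{cases} \langle \mu_i^{\alpha xy},\ \sigma_i^{2\,\alpha xy},\ N_i^{\alpha xy} \rangle & \text{if } \eta_i^{\alpha x} = 1 \text{ and } \eta_i^{\alpha y} = 1 \\ \langle \mu_i^{\alpha x},\ \sigma_i^{2\,\alpha x},\ N_i^{\alpha x} \rangle & \text{if } \eta_i^{\alpha x} = 1 \text{ and } \eta_i^{\alpha y} = 0 \\ \langle \mu_i^{\alpha y},\ \sigma_i^{2\,\alpha y},\ N_i^{\alpha y} \rangle & \text{if } \eta_i^{\alpha x} = 0 \text{ and } \eta_i^{\alpha y} = 1 \end{cases}$$

where

[equation illegible in source]

represents the mean of the new population,

[equation illegible in source]

represents the variance of the combined population, and

[equation illegible in source]

represents the sample size of the combined population.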

The third method of calculating feature value similarity metrics attempts to take into
account the variance of the sampling distribution when the sample size of the population is small.
A more accurate estimator of the population variance is given by the term

$$\frac{\sum_p (H_{pi} - \mu_i^{\alpha x})^2 \times c_{pi} \times \Gamma_p^{\alpha x}}{\left(\sum_p c_{pi} \times \Gamma_p^{\alpha x}\right) - 1}$$

and represents the sample variance, which is an accurate estimator of the underlying
population variance.



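WO 97/02537 PCT/US96/10492

Accordingly, operator $\eta_i^{\alpha x}$ is redefined as:

$$\eta_i^{\alpha x} = \begin{cases} 1 & \text{if [condition illegible in source]} \\ 0 & \text{otherwise} \end{cases}$$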



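and the triplet is defined as:

$$v_i^{\alpha x} = \langle \mu_i^{\alpha x},\ S_i^{2\,\alpha x},\ N_i^{\alpha x} \rangle$$

Given the above, the sample variance is represented as:

$$S_i^{2\,\alpha x} = \frac{\sum_p (H_{pi} - \mu_i^{\alpha x})^2 \times c_{pi} \times \Gamma_p^{\alpha x}}{N_i^{\alpha x} - 1}$$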

The sample variance and the variance of the sample distribution for a finite population
are related by the following relationship:

[equation illegible in source]

which transforms the standard deviation into:

[equation illegible in source]



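Thus, the feature value vector combination function is defined as:

$$v_i^{\alpha xy} = \begin{cases} \langle \mu_i^{\alpha xy},\ S_i^{2\,\alpha xy},\ N_i^{\alpha xy} \rangle & \text{if } \eta_i^{\alpha x} = 1 \text{ and } \eta_i^{\alpha y} = 1 \\ \langle \mu_i^{\alpha x},\ S_i^{2\,\alpha x},\ N_i^{\alpha x} \rangle & \text{if } \eta_i^{\alpha x} = 1 \text{ and } \eta_i^{\alpha y} = 0 \\ \langle \mu_i^{\alpha y},\ S_i^{2\,\alpha y},\ N_i^{\alpha y} \rangle & \text{if } \eta_i^{\alpha x} = 0 \text{ and } \eta_i^{\alpha y} = 1 \end{cases}$$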



Regardless of the feature value combination function used, the item similarity metrics
generated by them are used to generate feature value clusters. Feature value clusters are
generated from the item similarity metrics using any clustering algorithm known in the art. For
example, the method described above with respect to grouping items could be used to group
values within each feature.

Feature values can be clustered both periodically and incrementally. Incremental
clustering is necessary when the number of feature values for items is so large that reclustering of
all feature values cannot be done conveniently. However, incremental clustering may be used for
any set of items, and it is preferred to use both periodic reclustering and incremental reclustering.

All feature values are periodically reclustered using any clustering method known in the
art, such as K-means. It is preferred that this is done infrequently, because of the time that may
be required to complete such a reclustering. In order to cluster new feature values present in
items new to the domain, feature values are incrementally clustered. New feature values present
in the new items are clustered into the already existing feature value clusters. These feature
values may or may not be reclustered into another feature value cluster when the next complete
reclustering is done.
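
One way incremental clustering could work is to place each new feature value in the existing cluster whose centroid is nearest, leaving the centroids untouched until the next full reclustering. The text does not specify how new values are assigned, so the nearest-centroid rule, the dense vectors, and the name `assign_incrementally` in this sketch are all assumptions.

```python
def assign_incrementally(new_vector, centroids):
    """Return the index of the existing cluster centroid closest to new_vector.

    Assumption: squared Euclidean distance over dense, equal-length vectors.
    """
    def sq_dist(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(range(len(centroids)), key=lambda k: sq_dist(new_vector, centroids[k]))


# Hypothetical centroids of two existing feature value clusters.
centroids = [[1.0, 1.0], [6.0, 6.0]]
cluster_index = assign_incrementally([5.0, 7.0], centroids)
```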

Using the feature value clusters generated by any one of the methods described above, a
method for recommending an item, as shown in FIG. 2, uses feature clusters to aid in predicting
ratings and proceeds as the method of FIG. 1, in that a plurality of user profiles is stored (step
102') and a plurality of item profiles are stored (step 104'). The method using feature value
clusters assigns a weight to each feature value cluster and a weight to each feature based on the
user's rating of the item (steps 120 and 122).

A feature value cluster weight for each cluster is calculated for each user based on the
user's ratings of items containing that cluster. The cluster weight is an indication of how
important a particular user seems to find a particular feature value cluster. For example, a feature
for an item in a music domain might be the identity of the producer. If a user rated highly all
items having a particular producer (or cluster of producers), then the user appears to place great
emphasis on that particular producer (feature value) or cluster of producers (feature value
cluster).

Any method of assigning feature value cluster weight that takes into account the user's
rating of the item and the existence of the feature value cluster for that item is sufficient; however,
it is currently preferred to assign feature value cluster weights by summing all of the item ratings
that a user has entered and dividing by the number of feature value clusters. Expressed
mathematically, the vector weight for cluster x of feature $\alpha$ for user I is:

[equation illegible in source, ending in "0 otherwise"]

where $\gamma_p^{\alpha x}$ is a Boolean operator indicating whether item p contains the feature
value cluster x of feature $\alpha$.

The feature value cluster weight is used, in turn, to define a feature weight. The feature
weight reflects the importance of that feature relative to the other features for a particular user.
Any method of estimating a feature weight can be used; for example, feature weights may be
defined as the reciprocal of the number of features defined for all items. It is preferred that
feature weights are defined as the standard deviation of all cluster weights divided by the mean of
all cluster weights. Expressed mathematically:

$$FW_{\alpha}^{I} = \frac{\mathrm{StandardDev}(CW_{\alpha x}^{I})}{\mathrm{Mean}(CW_{\alpha x}^{I})}$$
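
The two weights can be illustrated with a short Python sketch. It reads the preferred cluster weight as the mean of the user's ratings over items that contain the cluster (0 when the user has rated no such item), and it uses the population standard deviation for the feature weight; both readings are assumptions, as are the function names and the sample data.

```python
from statistics import mean, pstdev


def cluster_weight(user_ratings, items_in_cluster):
    """Mean of the user's ratings for items containing the cluster; 0.0 if none were rated."""
    rated = [r for item, r in user_ratings.items() if item in items_in_cluster]
    return sum(rated) / len(rated) if rated else 0.0


def feature_weight(cluster_weights):
    """Standard deviation of the cluster weights divided by their mean (0.0 if the mean is 0)."""
    average = mean(cluster_weights)
    return pstdev(cluster_weights) / average if average else 0.0


# Hypothetical user who rates one cluster of producers highly and another poorly.
user_ratings = {"p1": 6, "p2": 6, "p3": 2}
producer_clusters = {"A": {"p1", "p2"}, "B": {"p3"}}
weights = [cluster_weight(user_ratings, items) for items in producer_clusters.values()]
producer_weight = feature_weight(weights)
```

A feature whose cluster weights vary widely for a user receives a larger feature weight, reflecting that the user discriminates strongly on that feature.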

The feature value cluster weights and the feature weights are used to calculate the
similarity factor between two users. The similarity factor between two users may be calculated by
any method that takes into account the assigned weights. For example, any of the methods for
calculating the similarity between two users, as described above, may be used provided they are
augmented by the feature weights and feature value weights. Thus,

[equation illegible in source]

represents the similarity between users I and J, where $\Gamma^{\alpha x}(D_{I,J})$ is a Boolean
operator on a vector of values indicating whether feature value cluster x for feature $\alpha$ of
the vector is defined, and where

[equation illegible in source, ending in "otherwise"]



The representation of an item as a set of feature values allows the application of various
feature-based similarity metrics between items. Two items may not share any identical feature
values but still be considered quite similar to each other if they share some feature value clusters.
This allows the recommendation of unrated items to a user based on the unrated items' similarity
to other items which the user has already rated highly.

The similarity between two items $p_1$ and $p_2$, where $P_1$ and $P_2$ represent the
corresponding sets of feature values possessed by these items, can be represented as some function,
f, of the following three sets: the number of common feature values shared by the two items; the
number of feature values that $p_1$ possesses that $p_2$ does not; and the number of feature values
that $p_2$ possesses that $p_1$ does not.

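Thus, the similarity between two items, denoted by $S(p_1, p_2)$, is represented as:

$$S(p_1, p_2) = f(P_1 \cap P_2,\ P_1 - P_2,\ P_2 - P_1)$$

Each item is treated as a vector of feature value clusters and the item-item similarity
metrics are defined as:

$$f(P_1 \cap P_2) = \sum_{\alpha=1}^{|Features\ Defined|} FW_{\alpha}^{I} \times \sum_{x=1}^{|\alpha|} \left( CW_{\alpha x}^{I} \times \gamma_{p_1}^{\alpha x} \times \gamma_{p_2}^{\alpha x} \right)$$

$$f(P_1 - P_2) = \sum_{\alpha=1}^{|Features\ Defined|} FW_{\alpha}^{I} \times \sum_{x=1}^{|\alpha|} \left( CW_{\alpha x}^{I} \times \gamma_{p_1}^{\alpha x} \times (1 - \gamma_{p_2}^{\alpha x}) \right)$$

$$f(P_2 - P_1) = \sum_{\alpha=1}^{|Features\ Defined|} FW_{\alpha}^{I} \times \sum_{x=1}^{|\alpha|} \left( CW_{\alpha x}^{I} \times (1 - \gamma_{p_1}^{\alpha x}) \times \gamma_{p_2}^{\alpha x} \right)$$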

This metric is personalized to each user since the feature weights and cluster weights 
reflect the relative importance of a particular feature value to a user.

Another method of defining item-item similarity metrics attempts to take into account the
case where one pair of items has numerous identical feature values, because if two items share a
number of identical feature values, they are more similar to each other than two items that do not
share feature values. Using this method, $f(P_1 \cap P_2)$ is defined as:

$$f(P_1 \cap P_2) = \sum_{\alpha=1}^{|Features\ Defined|} FW_{\alpha}^{I} \times \left( \sum_{x=1}^{|\alpha|} \left( CW_{\alpha x}^{I} \times \gamma_{p_1}^{\alpha x} \times \gamma_{p_2}^{\alpha x} \right) + \sum_{y=1}^{|FV^{\alpha}|} \left( \Gamma_{p_1}^{\alpha y} \times \Gamma_{p_2}^{\alpha y} \right) \right)$$
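
The three weighted quantities used by the first item-item metric (clusters shared by both items, clusters only the first item has, and clusters only the second item has) can be sketched in Python as follows. How the function f combines the three quantities is left open in the text, so the sketch only computes them. Representing each item as a dictionary from feature to a set of feature value clusters is an assumption, as are the names and sample weights.

```python
def item_similarity_components(p1, p2, feature_weights, cluster_weights):
    """Return (common, only_p1, only_p2), each weighted by feature and cluster weights.

    p1, p2: dict feature -> set of feature value clusters present in the item.
    feature_weights: dict feature -> weight for one user.
    cluster_weights: dict feature -> dict cluster -> weight for the same user.
    """
    common = only_p1 = only_p2 = 0.0
    for feature, fw in feature_weights.items():
        c1 = p1.get(feature, set())
        c2 = p2.get(feature, set())
        cw = cluster_weights.get(feature, {})
        common += fw * sum(cw.get(x, 0.0) for x in c1 & c2)
        only_p1 += fw * sum(cw.get(x, 0.0) for x in c1 - c2)
        only_p2 += fw * sum(cw.get(x, 0.0) for x in c2 - c1)
    return common, only_p1, only_p2


# Hypothetical weights for one user and two items that share a genre but not a producer.
feature_weights = {"genre": 0.5, "producer": 1.0}
cluster_weights = {"genre": {"jazz": 6.0, "rock": 2.0}, "producer": {"A": 4.0, "B": 3.0}}
item1 = {"genre": {"jazz"}, "producer": {"A"}}
item2 = {"genre": {"jazz"}, "producer": {"B"}}
components = item_similarity_components(item1, item2, feature_weights, cluster_weights)
```

Because the weights belong to a single user, the same pair of items can yield different components for different users.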

Another method for calculating item-item similarity is to treat each item as a vector of
feature value clusters and then compute the weighted dot product of the two vectors. Thus,

[equation illegible in source]

where

[equation illegible in source]

The methods described above can be provided as software on any suitable medium that is
readable by a computing device. The software program means may be implemented in any
suitable language such as C, C++, PERL, LISP, ADA, assembly language or machine code. The
suitable media may be any device capable of storing program means in a computer-readable
fashion, such as a floppy disk, a hard disk, an optical disk, a CD-ROM, a magnetic tape, a
memory card, or a removable magnetic drive.

An apparatus may be provided to recommend items to a user. The apparatus, as shown in
FIG. 3, has a memory element 12 for storing user and item profiles. Memory element 12 can be
any memory element capable of storing the profiles, such as RAM, EPROM, or magnetic media.

A means 14 for calculating is provided which calculates the similarity factors between
users. Calculating means 14 may be specialized hardware to do the calculation or, alternatively,
calculating means 14 may be a microprocessor or software running on a microprocessor resident
in a general-purpose computer.

Means 16 for selecting is also provided to select neighboring users responsive to the
similarity factors. Again, specialized hardware or a microprocessor may be provided to
implement the selecting means 16; however, it is preferred to provide a software program running
on a microprocessor resident in a general-purpose computer. Selecting means 16 may be a separate
microprocessor from calculating means 14 or it may be the same microprocessor.

A means 18 for assigning a weight to each of the neighboring users is provided and can be
specialized hardware, a separate microprocessor, the same microprocessor as calculating means
14 and selecting means 16, or a microprocessor resident in a general-purpose computer and
running software.

In some embodiments a receiving means is included in the apparatus (not shown in FIG.
3). Receiving means is any device which receives ratings for items from users. The receiving
means may be a keyboard or mouse connected to a personal computer. In some embodiments, an
electronic mail system operating over a local area network or a wide area network forms the
receiving means. In the preferred embodiment, a World Wide Web Page connected to the
Internet forms the receiving means.

Also included in the apparatus is means 20 for recommending at least one of the items to
the users based on the weights assigned to the users' neighboring users and the ratings given to
the item by the users' neighboring users. Recommendation means 20 may be specialized
hardware, a microprocessor, or, as above, a microprocessor running software and resident on a
general-purpose computer. Recommendation means 20 may also comprise an output device such
as a display, audio output, or printed output.

In another embodiment an apparatus for recommending an item is provided that uses
feature weights and feature value weights. This apparatus is similar to the one described above
except that it also includes a means for assigning a feature value cluster weight 22 and a means for
assigning a feature weight 24 (not shown in FIG. 3). Feature value cluster weight assigning
means 22 and feature weight assigning means 24 may be provided as specialized hardware,
a separate microprocessor, the same microprocessor as the other means, or as a single
microprocessor in a general-purpose computer.

FIG. 4 shows the Internet system on which the preferred method and apparatus may be
used. The server 40 is an apparatus as shown in FIG. 3, and it is preferred that server 40 displays
a World Wide Web Page when accessed by a user via Internet 42. Server 40 also accepts input
over the Internet 42. Multiple users 44 may access server 40 simultaneously.

EXAMPLE

The following example is one way of using the invention, which can be used to
recommend items in various domains for many items. By way of example, a new user 44 accesses
the system via the World Wide Web. The system displays a welcome page, which allows the user
44 to create an alias to use when accessing the system. Once the user 44 has entered a personal
alias, the user 44 is asked to rate a number of items. In this example the items to be rated are
recording artists in the music domain.

After the user 44 has submitted ratings for various recording artists, the system allows the
user 44 to enter ratings for additional artists or to request recommendations. If the user 44
desires to enter ratings for additional artists, the system can provide a list of artists the user 44 has
not yet rated. For the example, the system can simply provide a random listing of artists not yet
rated by the user 44. Alternatively, the user 44 can request to rate artists that are similar to
recording artists they have already rated, and the system will provide a list of similar artists using
the item similarity values previously calculated by the system. The user can also request to rate
recording artists from a particular group, e.g. modern jazz, rock, or big band, and the system will
provide the user 44 with a list of artists belonging to that group that the user 44 has not yet rated.
The user 44 can also request to rate more artists that the user's 44 neighboring users have rated,
and the system will provide the user 44 with a list of artists by selecting artists rated by the user's
44 neighboring users.

The user 44 can request the system to make artist recommendations at any time, and the
system allows the user 44 to tailor their request based on a number of different factors. Thus, the
system can recommend artists from various groups that the user's 44 neighboring users have also
rated highly. Similarly, the system can recommend a predetermined number of artists from a
particular group that the user will enjoy, e.g. opera singers. Alternatively, the system may
combine these approaches and recommend only opera singers that the user's neighboring users
have rated highly.

The system allows the user 44 to switch between rating items and receiving
recommendations many times. The system also provides a messaging function, so that users 44
may leave messages for other users that are not currently using the system. The system provides
"chat rooms," which allow users 44 to engage in conversation with other users 44' that are
currently accessing the system. These features are provided to allow users 44 to communicate
with one another. The system facilitates user communication by informing a user 44 that another
user 44' shares an interest in a particular recording artist. Also, the system may inform a user 44
that another user 44' that shares an interest in a particular recording artist is currently accessing
the system. The system will not only inform the user 44, but will encourage the user 44 to contact
the other user 44' that shares the interest. The user 44 may leave the system by leaving the Web
Page.

Having described preferred embodiments of the invention, it will now become apparent to
one of skill in the art that other embodiments incorporating the concepts may be used. It is felt,
therefore, that these embodiments should not be limited to disclosed embodiments but rather
should be limited only by the spirit and scope of the following claims.

CLAIMS

What is claimed is:

1. A method for recommending an item to one of a plurality of users, the item not yet rated
by the user, the method comprising the steps of:

(a) storing a user profile in a memory for each of a plurality of users, wherein the user
profile includes a plurality of values, each of at least some of the plurality of values representing a
rating given to one of a plurality of items by the user;

(b) storing an item profile in a memory for each of the plurality of items, wherein the item
profile includes a plurality of values, each of at least some of the plurality of values representing a
rating given to the item by one of the plurality of users;

(c) calculating, for each of the plurality of users, a plurality of similarity factors, each of
the plurality of similarity factors representing the similarity between each user and another one of
the plurality of users;

(d) selecting, for each of the plurality of users, a plurality of neighboring users responsive
to the similarity factors;

(e) assigning a weight to each of the neighboring users; and

(f) recommending at least one of the plurality of items to one of the plurality of users
based on the weights assigned to the user's neighboring users and the ratings given to the unrated
item by the user's neighboring users.

2. The method of claim 1, wherein step (c) further comprises:

(c-a) receiving a rating from one of the plurality of users for one of the plurality of items;

(c-b) updating the rating user's profile with the received rating;

(c-c) updating the rated item's profile with the received rating;

(c-d) calculating, for the rating user, a plurality of similarity factors, each of the plurality
of similarity factors representing the similarity between the rating user and another user.

3. The method of claim 2 wherein step (c-a) comprises receiving a rating for an item from a
requesting user via a local area network.

4. The method of claim 2 wherein step (c-a) comprises receiving a rating for an item from a
requesting user via a wide area network.

5. The method of claim 2 wherein step (c-d) further comprises:

determining, from the rated item's profile, other users from which ratings for the item
have been previously received; and

calculating, for the rating user, a plurality of similarity factors, each of the plurality of
similarity factors representing the similarity of the rating user to another user that has also rated
the item.



6. The method of claim 1 wherein step (c) further comprises:

(c-a) retrieving the item profile for one of the plurality of items that has been rated by one
of the plurality of users;

(c-b) determining, from the item's profile, other users that have also rated the item;

(c-c) calculating a similarity factor between the one user and each of the plurality of other
users that have also rated the item;

(c-d) repeating steps (c-a) through (c-c) until all items rated by the one user have been
retrieved; and

(c-e) repeating steps (c-a) through (c-d) until similarity factors for each user have been
calculated.

7. The method of claim 6 wherein step (c-c) further comprises:

subtracting the rating given to the item by each of the plurality of other users from the
rating given to the item by the requesting user;

squaring each rating difference; and

dividing the sum of the squared differences by the number of other users that have also
rated the item.

8. The method of claim 1 wherein step (d) comprises selecting, for each of the plurality of
users, a plurality of neighboring users from the plurality of other users, the similarity factor for
each selected neighboring user being less than a predetermined threshold value.

9. The method of claim 8 wherein step (e) further comprises:

subtracting, for each neighboring user, the similarity factor for that neighboring user from
the predetermined threshold value and dividing each difference by the predetermined threshold
value.

10. The method of claim 1 wherein step (f) comprises:

predicting a rating for each item not yet rated by one of the plurality of users by
taking a weighted average of the ratings given to the items by the one user's neighboring users;
and

recommending a predetermined number of items based on the predicted ratings for
those items.

11. The method of claim 1 wherein step (f) comprises:

receiving an item selection from one of the plurality of users; and

predicting a rating for the selected item by taking a weighted average of the ratings
given to the selected item by the one user's neighboring users.

12. The method of claim 1 further comprising the step of:

(g) displaying information about recommended items on a display.

13. A method for recommending an item to one of a plurality of users, the item not yet rated by the user, the method comprising the steps of:

(a) storing a user profile in a memory for each of a plurality of users, wherein the user profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to one of a plurality of items by the user;

(b) storing an item profile in a memory for each of the plurality of items, each of the plurality of items belonging to one of a plurality of groups, wherein the item profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to the item by one of the plurality of users;

(c) calculating, for each of the plurality of users, a plurality of similarity factors, each of the plurality of similarity factors representing the similarity between each user and another of the plurality of users based on item ratings for a particular group;

(d) selecting, for each of the plurality of users, a plurality of neighboring users with respect to each group, the selection responsive to the similarity factors;

(e) assigning a weight to each of the neighboring users; and

(f) recommending an item to one of the plurality of users based on the weights assigned to the user's neighboring users and the ratings given to the unrated item by the user's neighboring users.



14. The method of claim 13 wherein step (c) further comprises:

(c-a) retrieving the item profile for one of the plurality of items that has been rated by one of the plurality of users;

(c-b) determining, from the item's profile, other users that have also rated the item;

(c-c) calculating a similarity factor between the one user and each of the plurality of other users that have also rated the item;

(c-d) repeating steps (c-a) through (c-c) until all items rated by the one user for the one group have been retrieved; and

(c-e) repeating steps (c-a) through (c-d) until similarity factors for each user have been calculated.
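The loop structure of steps (c-a) through (c-e) can be sketched as follows. This is a sketch under assumptions, not the claimed implementation: claim 14 does not fix the similarity measure, so a mean squared rating difference is assumed here, and the names `similarity_factors`, `item_group`, and the sample data are hypothetical.

```python
from collections import defaultdict


def similarity_factors(user_profiles, item_profiles, item_group, group):
    """Return {user: {other_user: factor}} for the items of one group."""
    factors = {}
    for user, ratings in user_profiles.items():           # (c-e) each user
        sums = defaultdict(float)
        counts = defaultdict(int)
        for item, rating in ratings.items():              # (c-d) each rated item
            if item_group[item] != group:
                continue                                  # other groups ignored
            profile = item_profiles[item]                 # (c-a) item profile
            for other, other_rating in profile.items():   # (c-b) other raters
                if other == user:
                    continue
                # (c-c) accumulate the assumed squared-difference measure
                sums[other] += (rating - other_rating) ** 2
                counts[other] += 1
        factors[user] = {o: sums[o] / counts[o] for o in sums}
    return factors


# Hypothetical profiles: user -> {item: rating}.
user_profiles = {
    "u1": {"A": 4, "B": 2, "X": 7},
    "u2": {"A": 6, "B": 2},
    "u3": {"A": 4, "X": 1},
}
item_group = {"A": "jazz", "B": "jazz", "X": "rock"}

# Item profiles (item -> {user: rating}) derived by inverting user profiles.
item_profiles = defaultdict(dict)
for user, ratings in user_profiles.items():
    for item, rating in ratings.items():
        item_profiles[item][user] = rating

factors = similarity_factors(user_profiles, item_profiles, item_group, "jazz")
```

Starting from the item profile means that only users who actually rated a common item are compared, which avoids scanning every pair of users.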

15. The method of claim 14 wherein step (c-c) further comprises calculating a similarity factor between the one user and each of the plurality of other users that have also rated the item, said similarity factor based only on ratings for other items belonging to the same group.

16. The method of claim 13 wherein step (d) comprises selecting, for each of the plurality of users, a plurality of neighboring users in each group, each selected neighboring user having a similarity factor less than a predetermined threshold value.

17. The method of claim 16 wherein step (e) further comprises:

subtracting, for each neighboring user, the similarity factor for that neighboring user from a predetermined threshold value and dividing each difference by the predetermined threshold value.

18. The method of claim 13 wherein step (f) comprises:

predicting a rating for each item in one of the plurality of groups not yet rated by one of the plurality of users by taking a weighted average of the ratings given to the items in the group by the one user's neighboring users; and

recommending a predetermined number of items from the group based on the predicted ratings for those items.



19. The method of claim 13 wherein step (f) comprises:

receiving an item selection from one of the plurality of users; and

predicting a rating for the selected item by taking a weighted average of the ratings given to the selected item by the one user's neighboring users for the group.



20. The method of claim 13 wherein step (f) comprises:

receiving a group selection from one of the plurality of users;

predicting a rating for items in the selected group by taking a weighted average of the ratings given to the items in the group by the one user's neighboring users for that group; and

recommending a predetermined number of items in the group based on the predicted ratings for those items.



21. The method of claim 13 further comprising the step of:

(g) displaying information about recommended items on a display.

22. The method of claim 13 wherein an item belongs to multiple groups.

23. A method for recommending, to one of a plurality of users, other users, the method comprising the steps of:

(a) storing a user profile in a memory for each of a plurality of users, wherein the user profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to one of a plurality of items by the user;

(b) storing an item profile in a memory for each of the plurality of items, each of the plurality of items belonging to one of a plurality of groups, wherein the item profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to the item by one of the plurality of users;

(c) calculating, for each of the plurality of users, a plurality of similarity factors, each of the plurality of similarity factors representing the similarity between each user and another one of the plurality of users based on the item ratings for a particular group; and

(d) recommending at least one of the neighboring users to one of the plurality of users based on the similarity factors.

24. The method of claim 23 wherein step (d) further comprises recommending at least one of the neighboring users to one of the plurality of users based on the similarity factors and the number of items rated by both the one user and the at least one neighboring user.

25. A method for predicting a rating for an item, the item having at least one feature defining a characteristic of the item and having a plurality of possible values that may be grouped into clusters of feature values, the method comprising the steps of:

(a) storing a user profile in a memory for each of a plurality of users, wherein the user profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to one of a plurality of items by the user;

(b) storing an item profile in a memory for each of the plurality of items, wherein the item profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to the item by one of the plurality of users;

(c) assigning, for each user, a weight to each value cluster within each feature;

(d) assigning, for each user, a weight to each feature;

(e) calculating, for each of the plurality of users, a plurality of similarity factors, each of the plurality of similarity factors based on the feature weights, the cluster weights, and the ratings given to items by the respective users;

(f) selecting, for each of the plurality of users, a plurality of neighboring users responsive to the similarity factors;

(g) assigning a weight to each of the neighboring users; and

(h) recommending an item to one of the plurality of users based on the weights assigned to the user's neighboring users and the ratings given to the unrated item by the user's neighboring users.



26. The method of claim 25 wherein step (c) further comprises:

assigning, for each user, a weight to each value cluster within each feature based on the rating given to the item by the user and the number of feature value clusters present.

27. The method of claim 25 wherein step (d) further comprises:

assigning, for each user, a weight to each feature based on the number of features defined for the item.

28. The method of claim 25 wherein step (d) further comprises:

assigning, for each user, a weight to each feature based on the weights assigned to each feature value cluster.

29. The method of claim 28 wherein the feature weight is assigned by dividing the standard deviation of the cluster weights by the mean of the cluster weights.
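The ratio recited in claim 29 can be sketched as follows. The function name and the sample cluster weights are hypothetical, and the claim does not say whether a population or a sample standard deviation is intended; the population form is assumed here.

```python
import statistics


def feature_weight(cluster_weights):
    """Standard deviation of the cluster weights divided by their mean.

    Assumption: population standard deviation (statistics.pstdev).
    """
    mean = statistics.mean(cluster_weights)
    if mean == 0:
        raise ValueError("mean of cluster weights is zero")
    return statistics.pstdev(cluster_weights) / mean


# Hypothetical cluster weights for one feature and one user.
weights = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]
fw = feature_weight(weights)
```

A feature whose cluster weights vary widely relative to their mean receives a larger weight, while a feature whose clusters are all weighted alike receives a weight near zero.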

30. The method of claim 25 wherein step (c) further comprises the steps of:

(c-a) calculating, for each feature value, a similarity factor to all other feature values in a feature; and

(c-b) grouping the feature values into clusters based on the similarity factors.

31. A method for recommending an item to one of a plurality of users, the item having at least one feature defining a characteristic of the item and having a plurality of possible values that may be grouped into clusters of feature values, the method comprising the steps of:

(a) storing a user profile in a memory for each of a plurality of users, wherein the user profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to one of a plurality of items by the user;

(b) storing an item profile in a memory for each of the plurality of items, wherein the item profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to the item by one of the plurality of users;

(c) assigning, for at least one user, a weight to each value cluster within each feature;

(d) assigning, for the at least one user, a weight to each feature;

(e) calculating, for each of the plurality of items, a plurality of similarity factors, each of the plurality of similarity factors based on the feature weights and the cluster weights;

(f) selecting an item for which a favorable rating has been received from the at least one user;

(g) selecting a plurality of items responsive to the selected item and the similarity factors; and

(h) recommending at least one of the selected items to the at least one user.

32. An article of manufacture having program means for recommending an item embodied therein, the article of manufacture comprising:

computer-readable program means for storing a user profile in a memory for each of a plurality of users, wherein the user profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to one of a plurality of items by the user;

computer-readable program means for storing an item profile in a memory for each of the plurality of items, wherein the item profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to the item by one of the plurality of users;

computer-readable program means for calculating, for each of the plurality of users, a plurality of similarity factors, each of the plurality of similarity factors representing the similarity between each user and another one of the plurality of users;

computer-readable program means for selecting, for each of the plurality of users, a plurality of neighboring users responsive to the similarity factors;

computer-readable program means for assigning a weight to each of the neighboring users; and

computer-readable program means for recommending at least one of the plurality of items to one of the plurality of users based on the weights assigned to the user's neighboring users and the ratings given to the unrated item by the user's neighboring users.

33. An article of manufacture having program means embodied therein, the program means for predicting a rating for an item, the item having at least one feature defining a characteristic of the item and having a plurality of possible values that may be grouped into clusters of feature values, the article of manufacture comprising:

computer-readable program means for storing a user profile in a memory for each of a plurality of users, wherein the user profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to one of a plurality of items by the user;

computer-readable program means for storing an item profile in a memory for each of the plurality of items, wherein the item profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to the item by one of the plurality of users;

computer-readable program means for assigning, for each user, a weight to each value cluster within each feature;

computer-readable program means for assigning, for each user, a weight to each feature;

computer-readable program means for calculating, for each of the plurality of users, a plurality of similarity factors, each of the plurality of similarity factors based on the feature weights, the cluster weights, and the ratings given to items by the respective users;

computer-readable program means for selecting, for each of the plurality of users, a plurality of neighboring users responsive to the similarity factors;

computer-readable program means for assigning a weight to each of the neighboring users; and

computer-readable program means for recommending an item to one of the plurality of users based on the weights assigned to the user's neighboring users and the ratings given to the unrated item by the user's neighboring users.

34. An apparatus for recommending an item to one of a plurality of users, the item not yet rated by the user, comprising:

a memory element for storing user profiles, wherein each user profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to one of a plurality of items by the user;

a memory element for storing item profiles, wherein each item profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to the item by one of the plurality of users;

means for calculating, for each of the plurality of users, a plurality of similarity factors, each of the plurality of similarity factors representing the similarity between each user and another one of the plurality of users;

means for selecting, for each of the plurality of users, a plurality of neighboring users responsive to the similarity factors;

means for assigning a weight to each of the neighboring users; and

means for recommending at least one of the plurality of items to one of the plurality of users based on the weights assigned to the user's neighboring users and the ratings given to the unrated item by the user's neighboring users.




35. An apparatus for recommending an item, the item having at least one feature defining a characteristic of the item and having a plurality of possible values that may be grouped into clusters of feature values, comprising:

a memory element for storing user profiles, wherein each user profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to one of a plurality of items by the user;

a memory element for storing item profiles, wherein each item profile includes a plurality of values, each of at least some of the plurality of values representing a rating given to the item by one of the plurality of users;

means for assigning, for each user, a weight to each value cluster within each feature;

means for assigning, for each user, a weight to each feature;

means for calculating, for each of the plurality of users, a plurality of similarity factors, each of the plurality of similarity factors based on the feature weights, the cluster weights, and the ratings given to items by the respective users;

means for selecting, for each of the plurality of users, a plurality of neighboring users responsive to the similarity factors;

means for assigning a weight to each of the neighboring users; and

means for recommending an item to one of the plurality of users based on the weights assigned to the user's neighboring users and the ratings given to the unrated item by the user's neighboring users.